Interactive, Geospatial Visualizations of Federal Data Using Bokeh and Geopandas

By: Matt Ring

Purpose: Create interactive, geospatial visualizations with minimal Python knowledge and few packages. Visualizations will be built incrementally, culminating in a final visualization that combines most of the tools described beforehand.

NOTE: Before getting started, extract all files from the .zip file and add them to the data folder. You can find the GitHub repository for my work here.

1. Install Packages

If you're working in Anaconda, you'll only need to install geopandas. Uncomment the lines starting with "!" to ensure you have all necessary packages.



If you're not working in Anaconda, you'll likely need to install a few more packages. Uncomment the lines starting with "!" below to ensure you have all necessary packages installed.
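Before installing anything, it can help to see which packages are actually missing. The following is a minimal sketch, not the original install cell: the package list is an assumption based on the libraries named in this tutorial, and the commented install line only works inside a notebook cell.

```python
import importlib.util

# Packages this tutorial relies on (assumed list; adjust to your notebook).
REQUIRED_PACKAGES = ["pandas", "geopandas", "bokeh"]

# find_spec returns None when a top-level package cannot be found.
missing = [
    name for name in REQUIRED_PACKAGES
    if importlib.util.find_spec(name) is None
]

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")

# In a notebook cell, uncomment the next line to install whatever is missing:
# !pip install pandas geopandas bokeh
```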



2. Loading Data & Packages

Packages:

Data: County-level statistics for the DC, Maryland, and Virginia Area.



These data are all publicly available from a range of US federal agencies, including the Census Bureau, the Centers for Medicare and Medicaid Services, and the Bureau of Transportation Statistics. A codebook for the features can be found in the data folder; it includes a written description of each variable and the agency from which it was gathered. For this tutorial, we will be using the inequality index from the Census Bureau's 2019 5-year American Community Survey.
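As a rough sketch of the loading step, the example below reads a small in-memory CSV with pandas. The column names (`FIPS`, `year`, `county`, `inequality`) and the values are made up for illustration, and the file path in the comment is hypothetical. The one design choice worth noting is reading the county code as a string, since the GEOID column in Census shapefiles is also a string and the two must match to merge.

```python
import io

import pandas as pd

# Stand-in for the real file. With the tutorial data this would be
# something like pd.read_csv("data/<your_file>.csv", dtype={"FIPS": str})
# (hypothetical path and column name).
sample_csv = io.StringIO(
    "FIPS,year,county,inequality\n"
    "11001,2019,District of Columbia,0.50\n"
    "24031,2019,Montgomery County,0.45\n"
    "51059,2019,Fairfax County,0.40\n"
)

# Read the county code as text so it can be matched against the
# string-valued GEOID column in the county shapefile.
df = pd.read_csv(sample_csv, dtype={"FIPS": str})

print(df.dtypes)
print(df.head())
```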


Now we'll import the Census's county shapefiles.


Let's subset to DC, Maryland, and Virginia (FIPS 11, 24, and 51). Further analysis on the rest of the United States is possible but requires more computing power.
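The subset can be sketched as a filter on the state FIPS code. This example builds a toy GeoDataFrame instead of reading the real shapefile (the path in the comment is hypothetical); `STATEFP`, `GEOID`, and `NAME` are the column names used in Census TIGER county files, and the square geometries are placeholders.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy stand-in for the Census county shapefile. With the real file this
# would be gpd.read_file("data/<county_shapefile>.shp") (hypothetical path).
counties = gpd.GeoDataFrame(
    {
        "STATEFP": ["11", "24", "51", "42"],
        "GEOID": ["11001", "24031", "51059", "42001"],
        "NAME": ["District of Columbia", "Montgomery", "Fairfax", "Adams"],
    },
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, 1, 1, 2), box(1, 1, 2, 2)],
)

# State FIPS codes for DC, Maryland, and Virginia (stored as strings).
dmv_fips = ["11", "24", "51"]
dmv = counties[counties["STATEFP"].isin(dmv_fips)].reset_index(drop=True)

print(dmv[["STATEFP", "GEOID", "NAME"]])
```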



Finally, we'll merge the county shapes to the federal data, dropping GEOID and year. NOTE: The shapefile MUST be on the left in the merge; otherwise the result is a plain DataFrame rather than a GeoDataFrame and loses its geographic methods.
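To illustrate why merge order matters, here is a small sketch with toy data. The key column on the federal data (`FIPS`) and the values are assumptions for illustration only; the point is the type of the result when the GeoDataFrame is on the left versus the right.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Toy county shapes (placeholder squares) and toy federal statistics.
shapes = gpd.GeoDataFrame(
    {"GEOID": ["11001", "24031"], "NAME": ["District of Columbia", "Montgomery"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)
stats = pd.DataFrame(
    {"FIPS": ["11001", "24031"], "year": [2019, 2019], "inequality": [0.50, 0.45]}
)

# GeoDataFrame on the left: the result keeps its geometry-aware type.
merged = shapes.merge(stats, left_on="GEOID", right_on="FIPS", how="left")
merged = merged.drop(columns=["GEOID", "year"])

# pandas DataFrame on the left: the result is a plain DataFrame.
reversed_merge = stats.merge(shapes, left_on="FIPS", right_on="GEOID", how="left")

print(type(merged).__name__)
print(type(reversed_merge).__name__)
```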



3. Basic Visualization

Goal: Create and fine-tune a rudimentary Bokeh visualization. Basic visualizations include interactive figures that allow for scrolling, zooming, saving, and resetting.
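A minimal sketch of such a figure follows. It uses two placeholder squares instead of the merged county data, converts them to GeoJSON for Bokeh, and enables only the pan, wheel-zoom, save, and reset tools. The variable names and styling are illustrative, not the original notebook's.

```python
import geopandas as gpd
from shapely.geometry import box
from bokeh.models import GeoJSONDataSource
from bokeh.plotting import figure, show

# Placeholder shapes; in the tutorial this would be the merged GeoDataFrame.
gdf = gpd.GeoDataFrame(
    {"NAME": ["County A", "County B"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)

# Bokeh reads polygon data as GeoJSON; it exposes the coordinates
# to glyphs under the column names "xs" and "ys".
geosource = GeoJSONDataSource(geojson=gdf.to_json())

p = figure(
    title="Toy counties",
    tools="pan,wheel_zoom,save,reset",
    match_aspect=True,  # keep shapes from being stretched
)
p.patches(
    "xs", "ys",
    source=geosource,
    fill_color="lightgray",
    line_color="black",
    line_width=0.5,
)

# In a notebook, call bokeh.io.output_notebook() first, then:
# show(p)
```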



Shapes & Colors

Here, two plots will be made:
  1. Colored by state (a categorical variable)
  2. Colored by inequality (a continuous variable)
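The two coloring approaches can be sketched with Bokeh's color mappers: a `CategoricalColorMapper` for the state code and a `LinearColorMapper` for the inequality index. The data, column names, and palettes below are placeholders chosen for illustration.

```python
import geopandas as gpd
from shapely.geometry import box
from bokeh.models import (
    CategoricalColorMapper,
    ColorBar,
    GeoJSONDataSource,
    LinearColorMapper,
)
from bokeh.palettes import Category10, Viridis256
from bokeh.plotting import figure

# Placeholder data: one square per state with a made-up inequality value.
gdf = gpd.GeoDataFrame(
    {"STATEFP": ["11", "24", "51"], "inequality": [0.50, 0.45, 0.40]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, 1, 1, 2)],
)
geosource = GeoJSONDataSource(geojson=gdf.to_json())

# 1. Categorical: one color per state FIPS code.
state_mapper = CategoricalColorMapper(
    factors=["11", "24", "51"], palette=Category10[3]
)
by_state = figure(title="Colored by state")
by_state.patches(
    "xs", "ys", source=geosource, line_color="white",
    fill_color={"field": "STATEFP", "transform": state_mapper},
)

# 2. Continuous: map the inequality range onto a sequential palette.
ineq_mapper = LinearColorMapper(
    palette=Viridis256,
    low=float(gdf["inequality"].min()),
    high=float(gdf["inequality"].max()),
)
by_inequality = figure(title="Colored by inequality")
by_inequality.patches(
    "xs", "ys", source=geosource, line_color="white",
    fill_color={"field": "inequality", "transform": ineq_mapper},
)

# A color bar acts as the legend for the continuous scale.
color_bar = ColorBar(color_mapper=ineq_mapper)
by_inequality.add_layout(color_bar, "right")
```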


Additional Tools

Goal: Explore more advanced tools in Bokeh, including components that allow one to interact with the data itself and link to other geographic information. Tools:

A. Hover & Tap
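As a hedged sketch of these two tools: a `HoverTool` shows tooltip fields for the polygon under the cursor, and a `TapTool` lets a click select a polygon, which the glyph can then restyle. The data, column names, and colors are placeholders, and what the original notebook does on tap is not shown here.

```python
import geopandas as gpd
from shapely.geometry import box
from bokeh.models import GeoJSONDataSource, HoverTool, TapTool
from bokeh.plotting import figure

# Placeholder data with a name and a made-up inequality value.
gdf = gpd.GeoDataFrame(
    {"NAME": ["County A", "County B"], "inequality": [0.50, 0.45]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)
geosource = GeoJSONDataSource(geojson=gdf.to_json())

p = figure(title="Hover and tap", tools="pan,wheel_zoom,reset")
counties = p.patches(
    "xs", "ys",
    source=geosource,
    fill_color="lightsteelblue",
    line_color="white",
    hover_fill_color="orange",        # style while hovered
    selection_fill_color="firebrick", # style after a tap selects it
    nonselection_fill_alpha=0.3,      # fade everything else
)

# "@column" pulls the value from the data source; "{0.00}" formats it.
hover = HoverTool(
    renderers=[counties],
    tooltips=[("County", "@NAME"), ("Inequality", "@inequality{0.00}")],
)
p.add_tools(hover, TapTool(renderers=[counties]))
```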



B. Slider

NOTE: Coordinates are recorded in Web Mercator (EPSG:3857). Points were originally in latitude and longitude, but were converted to Web Mercator directly after importing so they would work with the background map in section C. Here, Web Mercator coordinates are used to create an x and y range for DC, Maryland, and Virginia.
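To make the conversion concrete, the sketch below applies the standard spherical Web Mercator formulas to a longitude/latitude bounding box and derives x and y ranges from it. The bounding box is an approximate assumption for the DC, Maryland, and Virginia area, not a value taken from the notebook. With geopandas, the equivalent step for a whole GeoDataFrame is `gdf.to_crs(epsg=3857)`.

```python
import math

# Radius of the sphere used by Web Mercator (EPSG:3857), in meters.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to Web Mercator x/y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


# Approximate bounding box for DC, Maryland, and Virginia (assumed values).
west, south, east, north = -83.7, 36.5, -75.0, 39.8

x_min, y_min = lonlat_to_web_mercator(west, south)
x_max, y_max = lonlat_to_web_mercator(east, north)

x_range = (x_min, x_max)
y_range = (y_min, y_max)
print("x_range:", x_range)
print("y_range:", y_range)

# These tuples can be passed to a Bokeh figure, for example:
# figure(x_range=x_range, y_range=y_range,
#        x_axis_type="mercator", y_axis_type="mercator")
```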



C. Background Map



4. Saving and Sharing

Goal: Storing and presenting interactive visualizations.

A. Individual Plots

Individual plots may be stored as static images or interactive .html files. NOTE: Additional packages, such as selenium, must be installed to export the visualizations as static images.
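As a minimal sketch of the interactive route, the example below saves a throwaway plot to a standalone .html file with `bokeh.io.save`. The plot and file name are placeholders. The static-image route is left as a comment because it needs selenium and a browser driver to be installed.

```python
import os
import tempfile

from bokeh.io import save
from bokeh.plotting import figure
from bokeh.resources import CDN

# Placeholder plot; in the tutorial this would be one of the maps.
p = figure(title="Example plot")
p.line([1, 2, 3], [4, 6, 5])

# Write a standalone HTML file that loads BokehJS from the CDN.
out_path = os.path.join(tempfile.mkdtemp(), "example_plot.html")
save(p, filename=out_path, resources=CDN, title="Example plot")
print("Saved interactive plot to", out_path)

# Static export requires selenium plus a browser driver:
# from bokeh.io import export_png
# export_png(p, filename="example_plot.png")
```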



B. Full Documents

The simplest way to save an entire document of interactive visualizations is to create and export them from Jupyter Notebook: go to File -> Export Notebook As... -> HTML. NOTE: Exporting a notebook with markdown cells will cause the Bokeh visualizations not to display. Exporting to html.slides can sometimes cause display issues as well.

5. Other Tutorials

For further reading, I highly recommend the following tutorials and resources, many of which guided the construction of this tutorial: